Identifying the Most Influential Data Objects with Reverse Top-k Queries

Authors

  • Akrivi Vlachou
  • Christos Doulkeridis
  • Kjetil Nørvåg
  • Yannis Kotidis
Abstract

Top-k queries are widely applied for retrieving a ranked set of the k most interesting objects based on the individual user preferences. As an example, in online marketplaces, customers (users) typically seek a ranked set of products (objects) that satisfy their needs. Reversing top-k queries leads to a query type that instead returns the set of customers that find a product appealing (it belongs to the top-k result set of their preferences). In this paper, we address the challenging problem of processing queries that identify the top-m most influential products to customers, where influence is defined as the cardinality of the reverse top-k result set. This definition of influence is useful for market analysis, since it is directly related to the number of customers that value a particular product and, consequently, to its visibility and impact in the market. Existing techniques require processing a reverse top-k query for each object in the database, which is prohibitively expensive even for databases of moderate size. In contrast, we propose two algorithms, SB and BB, for identifying the most influential objects: SB restricts the candidate set of objects that need to be examined, while BB is a branch-and-bound algorithm that retrieves the result incrementally. Furthermore, we propose meaningful variations of the query for most influential objects that are supported by our algorithms. Our experiments demonstrate the efficiency of our algorithms both for synthetic and real-life datasets.
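
To make the definition of influence concrete, the following is a minimal brute-force sketch in Python. It is not the SB or BB algorithm from the paper; it only illustrates the quantity being computed. It assumes linear scoring functions (a weight vector per customer), that lower scores rank higher, and that ties are broken by object index. None of these assumptions, nor the function names, are stated in the abstract.

```python
from typing import List, Sequence, Set, Tuple

Vector = Sequence[float]


def score(weights: Vector, obj: Vector) -> float:
    """Weighted sum of an object's attributes (assumed linear scoring)."""
    return sum(w * x for w, x in zip(weights, obj))


def top_k(weights: Vector, objects: Sequence[Vector], k: int) -> Set[int]:
    """Indices of the k best objects for one customer (lower score is better)."""
    ranked = sorted(range(len(objects)),
                    key=lambda i: (score(weights, objects[i]), i))
    return set(ranked[:k])


def influence_scores(objects: Sequence[Vector],
                     users: Sequence[Vector], k: int) -> List[int]:
    """Influence of each object: the size of its reverse top-k result set,
    i.e. the number of customers whose top-k contains the object."""
    counts = [0] * len(objects)
    for weights in users:
        for i in top_k(weights, objects, k):
            counts[i] += 1
    return counts


def most_influential(objects: Sequence[Vector], users: Sequence[Vector],
                     k: int, m: int) -> List[Tuple[int, int]]:
    """The m objects with the highest influence, as (index, influence) pairs."""
    counts = influence_scores(objects, users, k)
    order = sorted(range(len(objects)), key=lambda i: (-counts[i], i))
    return [(i, counts[i]) for i in order[:m]]


# Hypothetical toy data: four products with two attributes, three customers.
objects = [(1, 5), (2, 2), (5, 1), (4, 4)]
users = [(0.9, 0.1), (0.4, 0.6), (0.1, 0.9)]
print(most_influential(objects, users, k=2, m=2))  # [(1, 3), (2, 2)]
```

This sketch evaluates every customer against every object, which is the kind of exhaustive processing the abstract describes as prohibitively expensive; per the abstract, SB avoids it by restricting the candidate set and BB by retrieving the result incrementally.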

Similar resources

Reporting l most influential objects in uncertain databases based on probabilistic reverse top-k queries

Reverse top-k queries are proposed from the perspective of a product manufacturer, which are essential for manufacturers to assess the potential market. However, the existing approaches for reverse top-k queries are all based on the assumption that the underlying data are exact (or certain). Due to the intrinsic differences between uncertain and certain data, these methods cannot be applied to pr...

Reverse Engineering Top-k Join Queries

Ranked lists have become a fundamental tool to represent the most important items taken from a large collection of data. Search engines, sports leagues and e-commerce platforms present their results, most successful teams and most popular items in a concise and structured way by making use of ranked lists. This paper introduces the PALEO-J framework which is able to reconstruct top-k database q...

Efficient Reverse Top-k Boolean Spatial Keyword Queries on Road Networks

Reverse k nearest neighbor (RkNN) queries have a broad application base such as decision support, profile-based marketing, and resource allocation. Previous work on RkNN search does not take textual information into consideration or limits to the Euclidean space. In the real world, however, most spatial objects are associated with textual information and lie on road networks. In this paper, we ...

Efficient Processing of Distributed Top-k Queries

Ranking-aware queries, or top-k queries, have received much attention recently in various contexts such as web, multimedia retrieval, relational databases, and distributed systems. Top-k queries play a critical role in many decision-making related activities such as, identifying interesting objects, network monitoring, load balancing, etc. In this paper, we study the ranking aggregation problem...

Estimation of Potential Product Using Reverse Top-k Queries

At present, most of the applications return to the user a limited set of ranked results based on the individual user’s preferences, which are commonly validated through top-k queries. From the perspective of a manufacturer, it is imperative that the products appear in the highest ranked positions for many different user preferences, otherwise the product is not visible to the potential customers...

Journal:
  • PVLDB

Volume 3  Issue 

Pages  -

Publication date: 2010